\subsection{Analysis of Dataset}
We start by presenting the insights gained by analyzing the dataset. 
Each floor in the MIT floorplan dataset consists of a set of \textit{spaces} and their connections to other spaces. Floors can be represented as graphs: the spaces are interpreted as vertices and the connections between them as edges~\cite{Whiting2007topology}. A space can be a room enclosed by walls and accessible via doors, but a space can also have invisible boundaries, e.g., a coffee shop at the end of a corridor. 

Connector spaces such as $corridor$ and $stair$ are crucial parts of any indoor environment since they act as indoor highways. Intuitively, spaces whose function is to connect other rooms and floors should appear with high frequency in indoor environments. Table~\ref{table:freqnodes} shows the most frequent vertices in the MIT floorplan dataset together with their occurrence frequency over all floors. 
As can be seen, $stair$ and $corridor$ appear on most floors and rank as the two most frequently occurring spaces. Offices are also common spaces in campus buildings. 
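
The occurrence frequency can be made precise as follows. This is only a sketch: the notation is introduced here for illustration, and it assumes that frequencies are counted per floor. Let $\mathcal{D} = \{G_1, \dots, G_N\}$ denote the set of floor graphs and let $L(G_i)$ be the set of space labels appearing in $G_i$. The frequency of a label $\ell$ can then be written as
\[
  f(\ell) = \frac{\left|\{\, G_i \in \mathcal{D} : \ell \in L(G_i) \,\}\right|}{N},
\]
that is, the fraction of floors that contain at least one space of type $\ell$. Under this reading, the STAIR entry in Table~\ref{table:freqnodes} would mean that about $85\%$ of the floors contain at least one stair.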

Furthermore, we would expect to see some common patterns in floorplans. For example, we would expect certain facilities such as lavatories and elevators to be at easily reachable locations, well connected to frequently used spaces such as office rooms. Such recurring patterns would support the hypothesis that indoor topologies consist of commonly occurring smaller parts. Figure~\ref{fig:FreqSubgraphs} shows the most frequent subgraphs in the dataset for graph sizes of 3, 4 and 5 vertices. It is remarkable that even for the larger sizes of 4 and 5 vertices, certain patterns are commonplace in the dataset. 
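
The frequency of a subgraph can be stated in the same way. As a sketch, assuming the usual definition from frequent subgraph mining (the symbols below are illustrative and not taken from the dataset description), let $\mathcal{D} = \{G_1, \dots, G_N\}$ be the set of floor graphs and write $g \subseteq G_i$ if $g$ is isomorphic, with matching vertex labels, to a subgraph of $G_i$. The support of $g$ is then
\[
  \mathrm{supp}(g) = \frac{\left|\{\, G_i \in \mathcal{D} : g \subseteq G_i \,\}\right|}{N},
\]
and $g$ is called frequent if $\mathrm{supp}(g) \geq \theta$ for a chosen minimum threshold $\theta$. With $\theta = 0.16$, for instance, a subgraph would have to occur in at least $16\%$ of the floor graphs to count as frequent.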

\begin{figure*}
  \centering
  \subfloat[]{\label{fig:FS31}\includegraphics[scale=0.4]{figures/FS/FSSize31}}                
  \hspace{2mm}
  \subfloat[]{\label{fig:FS32}\includegraphics[scale=0.4]{figures/FS/FSSize32}}
  \hspace{2mm}
  \subfloat[]{\label{fig:FS33}\includegraphics[scale=0.4]{figures/FS/FSSize33}} \\
  \subfloat[]{\label{fig:FS41}\includegraphics[scale=0.4]{figures/FS/FSSize41}}                
  \hspace{2mm}
  \subfloat[]{\label{fig:FS42}\includegraphics[scale=0.4]{figures/FS/FSSize42}}
  \hspace{2mm}
  \subfloat[]{\label{fig:FS43}\includegraphics[scale=0.4]{figures/FS/FSSize43}} \\
  \subfloat[]{\label{fig:FS51}\includegraphics[scale=0.4]{figures/FS/FSSize51}}                
  \hspace{2mm}
  \subfloat[]{\label{fig:FS52}\includegraphics[scale=0.4]{figures/FS/FSSize52}}
  \hspace{2mm}
  \subfloat[]{\label{fig:FS53}\includegraphics[scale=0.4]{figures/FS/FSSize53}}
  \caption{The three most frequent subgraphs for graph sizes 3, 4 and 5. The frequencies of the subgraphs shown in Figures~\ref{fig:FS31}--\ref{fig:FS33} are
  $37.66,\: 37.11,\: 36.56$; for Figures~\ref{fig:FS41}--\ref{fig:FS43} they are $26.50,\: 25.04,\: 25.04$; and for Figures~\ref{fig:FS51}--\ref{fig:FS53} they are $17.18,\: 17.00,\: 17.00$, respectively.}
  \label{fig:FreqSubgraphs}
\end{figure*}

\begin{figure*}
  \centering
  \subfloat[]{\label{fig:CorrMatrix}\includegraphics[scale=0.6]{figures/corrMatrix2}}                
  \hspace{2mm}
  \subfloat[]{\label{fig:HighCorrFS1}\includegraphics[scale=0.7]{figures/HighCorrFS1}}
  \hspace{2mm}
  \subfloat[]{\label{fig:HighCorrFS2}\includegraphics[scale=0.7]{figures/HighCorrFS2}}
  \caption{Figure~\ref{fig:CorrMatrix} shows a plot of Pearson's correlation coefficient~\cite{FreqSubgraphPearson} (explained in Section~\ref{sec:preliminaries}) for all frequent subgraphs in the dataset that occur in more than 16\% of all graphs ($\theta=0.16$). The subgraphs are ordered such that the top left pixel corresponds to the most frequently occurring subgraph and the top right pixel to the least frequent one that still satisfies the minimum threshold of 16\%.
Each pixel represents a pair of frequent subgraphs, and brighter pixels indicate higher correlation. As an example, Figures~\ref{fig:HighCorrFS1} and~\ref{fig:HighCorrFS2} correspond to pixel (19,12), or equivalently (12,19), which is the most highly correlated pair found in this set. The correlation between these two frequent subgraphs is 0.97, which is extremely high. Having observed, for example, the graph in Figure~\ref{fig:HighCorrFS1}, we could say that the edit operation leading to 
the graph in Figure~\ref{fig:HighCorrFS2} is very probable. The edit operation would be to add an edge between ``OFF'' and ``P CIRC''.} 
  \label{fig:Corrs}
\end{figure*}

    \begin{table}
\begin{center}
\caption{Most common spaces in the dataset and their frequencies. Here ``JAN CL'', ``ELEC'' and ``OFF SV'' are abbreviations for janitor closet, electricity cabinet and office service, respectively.}
    \label{table:freqnodes}
    \begin{tabular}{ | l | l |}
    \hline
    Vertex & Frequency \\ \hline
    STAIR & $85\%$ \\
    CORR & $78\%$ \\
    OFF & $67\%$ \\
    OFF SV & $60\%$ \\
    ELEC & $60\%$ \\
    JAN CL & $57\%$ \\
    LOBBY & $48\%$ \\
    \hline
    \end{tabular}
\end{center}
    \end{table}

